FEAT: Generalize Colloquial Wordswap Attack Converter #1348
Merged
romanlutz merged 12 commits into Azure:main on Feb 27, 2026
@microsoft-github-policy-service agree
romanlutz reviewed on Feb 4, 2026 and left a comment:
Very nice! I am personally not convinced Singaporean should be the default BUT that's what we put in the issue description and it means current users won't be experiencing a breaking change. We may update that in the future, of course. Thanks a ton!
romanlutz approved these changes on Feb 4, 2026
romanlutz reviewed on Feb 4, 2026
Pull request overview
Generalizes ColloquialWordswapConverter to load colloquial word-substitution mappings from YAML files (with several new regional examples) while retaining support for direct in-code substitution dictionaries.
Changes:
- Add `wordswap_path` option to load substitutions from YAML under `pyrit/datasets/prompt_converters/colloquial_wordswaps/` (defaulting to `singaporean.yaml`).
- Add new colloquial wordswap YAML datasets (Singaporean + multiple regional examples).
- Update and expand unit tests to cover YAML-based wordswaps and constructor argument conflicts.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| pyrit/prompt_converter/colloquial_wordswap_converter.py | Loads substitutions from YAML via wordswap_path, adds conflict validation, and keeps custom-dict support. |
| tests/unit/converter/test_colloquial_wordswap_converter.py | Updates tests for YAML-based swaps and adds new scenarios/constructor validation tests. |
| pyrit/datasets/prompt_converters/colloquial_wordswaps/singaporean.yaml | Moves prior default substitutions into a dataset file. |
| pyrit/datasets/prompt_converters/colloquial_wordswaps/filipino.yaml | Adds Filipino example substitutions. |
| pyrit/datasets/prompt_converters/colloquial_wordswaps/indian.yaml | Adds Indian example substitutions. |
| pyrit/datasets/prompt_converters/colloquial_wordswaps/southern_american.yaml | Adds Southern American example substitutions. |
| pyrit/datasets/prompt_converters/colloquial_wordswaps/multicultural_london.yaml | Adds Multicultural London example substitutions. |
… files
- Add wordswap_path parameter to load substitutions from YAML files
- Keep custom_substitutions for backward compatibility
- Add YAML files: singaporean, filipino, indian, multicultural_london, southern_american
- Move hardcoded substitutions to singaporean.yaml (remains the default)
- Use CONVERTER_SEED_PROMPT_PATH for consistent path handling
- Make all __init__ params keyword-only for consistency with other converters
- Add input validation: YAML format, value types, mutual exclusion of params
- Catch yaml.YAMLError and re-raise as ValueError with file context
- Include wordswap_path in component identifier for proper tracking
- Add tests for YAML paths, error paths, and custom dict substitutions
Closes Azure#418
Co-authored-by: taherakolawala <181370151+taherakolawala@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
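The validation steps named in this commit message (mutual exclusion of the two parameters, and checks on the loaded value types) could look roughly like the sketch below. This is not the code from the PR: the function names are hypothetical, and the assumed data shape (a mapping from each word to a list of replacement strings) is inferred from the `custom_substitutions` dictionary rather than confirmed by the page. The data is passed in already parsed, so no YAML library is needed here.

```python
from typing import Any, Optional


def check_mutually_exclusive(
    custom_substitutions: Optional[dict[str, list[str]]],
    wordswap_path: Optional[str],
) -> None:
    """Reject calls that supply both a non-empty dict and a YAML path (hypothetical helper)."""
    # An empty dict is falsy, so it does not count as a conflict and
    # the caller can fall through to the default substitutions.
    if custom_substitutions and wordswap_path:
        raise ValueError("Provide either custom_substitutions or wordswap_path, not both.")


def validate_wordswap_data(data: Any, *, source: str) -> dict[str, list[str]]:
    """Check that parsed content is a mapping of word -> list of strings (assumed format)."""
    if not isinstance(data, dict):
        raise ValueError(
            f"{source}: expected a mapping at the top level, got {type(data).__name__}"
        )
    for word, replacements in data.items():
        if not isinstance(word, str):
            raise ValueError(f"{source}: key {word!r} must be a string")
        is_str_list = isinstance(replacements, list) and all(
            isinstance(item, str) for item in replacements
        )
        if not is_str_list:
            raise ValueError(f"{source}: value for {word!r} must be a list of strings")
    return data
```

Including the file name (`source`) in each message mirrors the commit's stated goal of raising `ValueError` "with file context", so a malformed dataset can be traced back to the file that caused it.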
…om_substitutions
- Keep deterministic as positional arg (reverting keyword-only change)
- Empty custom_substitutions={} falls through to defaults as before
- Add test for empty custom_substitutions behavior
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lize-colloquial-wordswapper
- Positional usage emits FutureWarning with migration guidance
- Keyword usage works without warning
- Add test for deprecation warning
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Verify FutureWarning is only emitted for positional usage, not keyword. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
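One common way to get the behavior these commits describe (a `FutureWarning` for positional use, no warning for keyword use) is to accept the legacy positional argument through `*args` and warn when it is used. The sketch below assumes the argument in question is `deterministic`, as the earlier commit suggests; `ExampleConverter` is a hypothetical stand-in, not the PyRIT class.

```python
import warnings


class ExampleConverter:
    """Hypothetical stand-in showing a positional-to-keyword migration."""

    def __init__(self, *args: bool, deterministic: bool = False) -> None:
        if len(args) > 1:
            raise TypeError(f"expected at most 1 positional argument, got {len(args)}")
        if args:
            # Legacy positional call: still honored, but flagged for migration.
            warnings.warn(
                "Passing 'deterministic' positionally is deprecated; "
                "use deterministic=<value> instead.",
                FutureWarning,
                stacklevel=2,
            )
            deterministic = args[0]
        self.deterministic = deterministic


# Keyword usage: no warning is emitted.
quiet = ExampleConverter(deterministic=True)

# Positional usage: emits a FutureWarning that can be captured in a test.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = ExampleConverter(True)
```

`stacklevel=2` makes the warning point at the caller's line rather than at `__init__`, which is usually the more useful location for migration guidance.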
romanlutz approved these changes on Feb 26, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
In accordance with issue #418, the converter in `colloquial_wordswap_converter.py` has been generalized to use different versions of colloquial word swaps.

A new directory called `colloquial_wordswaps` has been created in `pyrit/datasets/prompt_converters`. This directory contains the original default Singaporean word substitutions as well as a few regional colloquial word swap YAML examples. The `ColloquialWordswapConverter` class now accepts a new parameter called `wordswap_path` during initialization. It defaults to `singaporean.yaml`, but the argument can be set to any YAML file located in the `colloquial_wordswaps` directory.

To add a new set of word substitutions, create a new YAML file in the same format as the others and add it to the `pyrit/datasets/prompt_converters/colloquial_wordswaps` directory. Here is an example of how a YAML file must be formatted:

The following is an example initialization of the converter with a non-default wordswapper YAML:
People to whom this PR may be of interest: @eugeniavkim @romanlutz
Closes #418
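As a rough illustration of what a substitution mapping is used for, the sketch below applies a word-to-replacements dictionary to a sentence. It is an assumption-laden sketch rather than the converter's actual implementation: `swap_words` is a hypothetical function, the mapping entries are made up for the example, and the rule that deterministic mode picks the first replacement is a guess.

```python
import random
import re
from typing import Optional


def swap_words(
    text: str,
    substitutions: dict[str, list[str]],
    *,
    deterministic: bool = False,
    rng: Optional[random.Random] = None,
) -> str:
    """Replace whole-word matches using a word -> replacements mapping (hypothetical)."""
    if not substitutions:
        return text
    rng = rng or random.Random()
    # Case-insensitive lookup table keyed by the lowercased source word.
    lookup = {word.lower(): options for word, options in substitutions.items()}
    # Longest keys first so multi-word phrases win over their single-word parts.
    keys = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )

    def replace(match: re.Match) -> str:
        options = lookup[match.group(0).lower()]
        # Assumed rule: deterministic mode takes the first option, otherwise pick at random.
        return options[0] if deterministic else rng.choice(options)

    return pattern.sub(replace, text)


# Made-up mapping for illustration only; not taken from any of the YAML files.
example = {"father": ["papa", "pops"], "mother": ["mama"]}
result = swap_words("My father and mother are home", example, deterministic=True)
# result == "My papa and mama are home"
```

The word-boundary anchors keep `father` from matching inside `grandfather`, which matters once a mapping contains short, common words.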
Tests and Documentation
- `test_colloquial_wordswap_converter.py` for the generalizable wordswap converter.
- `test_colloquial_wordswap_converter.py` for checking multi-word custom word swap conversions.
- `uv run pytest tests/unit/converter/test_colloquial_wordswap_converter.py`